Goto

Collaborating Authors

 engineering design


Intelligent Design 4.0: Paradigm Evolution Toward the Agentic AI Era

Jiang, Shuo, Xie, Min, Chen, Frank Youhua, Ma, Jian, Luo, Jianxi

arXiv.org Artificial Intelligence

Research and practice in Intelligent Design (ID) have significantly enhanced engineering innovation, efficiency, quality, and productivity over recent decades, fundamentally reshaping how engineering designers think, behave, and interact with design processes. The recent emergence of Foundation Models (FMs), particularly Large Language Models (LLMs), has demonstrated general knowledge-based reasoning capabilities, and open new avenues for further transformation in engineering design. In this context, this paper introduces Intelligent Design 4.0 (ID 4.0) as an emerging paradigm empowered by foundation model-based agentic AI systems. We review the historical evolution of ID across four distinct stages: rule-based expert systems, task-specific machine learning models, large-scale foundation AI models, and the recent emerging paradigm of foundation model-based multi-agent collaboration. We propose an ontological framework for ID 4.0 and discuss its potential to support end-to-end automation of engineering design processes through coordinated, autonomous multi-agent-based systems. Furthermore, we discuss challenges and opportunities of ID 4.0, including perspectives on data foundations, agent collaboration mechanisms, and the formulation of design problems and objectives. In sum, these insights provide a foundation for advancing Intelligent Design toward greater adaptivity, autonomy, and effectiveness in addressing the growing complexity of engineering design.


Agentic Large Language Models for Conceptual Systems Engineering and Design

Massoudi, Soheyl, Fuge, Mark

arXiv.org Artificial Intelligence

Early-stage engineering design involves complex, iterative reasoning, yet existing large language model (LLM) workflows struggle to maintain task continuity and generate executable models. We evaluate whether a structured multi-agent system (MAS) can more effectively manage requirements extraction, functional decomposition, and simulator code generation than a simpler two-agent system (2AS). The target application is a solar-powered water filtration system as described in a cahier des charges. We introduce the Design-State Graph (DSG), a JSON-serializable representation that bundles requirements, physical embodiments, and Python-based physics models into graph nodes. A nine-role MAS iteratively builds and refines the DSG, while the 2AS collapses the process to a Generator-Reflector loop. Both systems run a total of 60 experiments (2 LLMs - Llama 3.3 70B vs reasoning-distilled DeepSeek R1 70B x 2 agent configurations x 3 temperatures x 5 seeds). We report a JSON validity, requirement coverage, embodiment presence, code compatibility, workflow completion, runtime, and graph size. Across all runs, both MAS and 2AS maintained perfect JSON integrity and embodiment tagging. Requirement coverage remained minimal (less than 20%). Code compatibility peaked at 100% under specific 2AS settings but averaged below 50% for MAS. Only the reasoning-distilled model reliably flagged workflow completion. Powered by DeepSeek R1 70B, the MAS generated more granular DSGs (average 5-6 nodes) whereas 2AS mode-collapsed. Structured multi-agent orchestration enhanced design detail. Reasoning-distilled LLM improved completion rates, yet low requirements and fidelity gaps in coding persisted.


Large Language Models for Combinatorial Optimization of Design Structure Matrix

Jiang, Shuo, Xie, Min, Luo, Jianxi

arXiv.org Artificial Intelligence

In complex engineering systems, the dependencies among components or development activities are often modeled and analyzed using Design Structure Matrix (DSM). Reorganizing elements within a DSM to minimize feedback loops and enhance modularity or process efficiency constitutes a challenging combinatorial optimization (CO) problem in engineering design and operations. As problem sizes increase and dependency networks become more intricate, traditional optimization methods that rely solely on mathematical heuristics often fail to capture the contextual nuances and struggle to deliver effective solutions. In this study, we explore the potential of Large Language Models (LLMs) to address such CO problems by leveraging their capabilities for advanced reasoning and contextual understanding. We propose a novel LLM-based framework that integrates network topology with contextual domain knowledge for iterative optimization of DSM sequencing-a common CO problem. Experiments on various DSM cases demonstrate that our method consistently achieves faster convergence and superior solution quality compared to both stochastic and deterministic baselines. Notably, incorporating contextual domain knowledge significantly enhances optimization performance regardless of the chosen LLM backbone. These findings highlight the potential of LLMs to solve complex engineering CO problems by combining semantic and mathematical reasoning. This approach paves the way towards a new paradigm in LLM-based engineering design optimization.


BikeBench: A Bicycle Design Benchmark for Generative Models with Objectives and Constraints

Regenwetter, Lyle, Obaideh, Yazan Abu, Chiotti, Fabien, Lykourentzou, Ioanna, Ahmed, Faez

arXiv.org Artificial Intelligence

We introduce BikeBench, an engineering design benchmark for evaluating generative models on problems with multiple real-world objectives and constraints. As generative AI's reach continues to grow, evaluating its capability to understand physical laws, human guidelines, and hard constraints grows increasingly important. Engineering product design lies at the intersection of these difficult tasks, providing new challenges for AI capabilities. BikeBench evaluates AI models' capabilities to generate bicycle designs that not only resemble the dataset, but meet specific performance objectives and constraints. To do so, BikeBench quantifies a variety of human-centered and multiphysics performance characteristics, such as aerodynamics, ergonomics, structural mechanics, human-rated usability, and similarity to subjective text or image prompts. Supporting the benchmark are several datasets of simulation results, a dataset of 10,000 human-rated bicycle assessments, and a synthetically generated dataset of 1.6M designs, each with a parametric, CAD/XML, SVG, and PNG representation. BikeBench is uniquely configured to evaluate tabular generative models, large language models (LLMs), design optimization, and hybrid algorithms side-by-side. Our experiments indicate that LLMs and tabular generative models fall short of hybrid GenAI+optimization algorithms in design quality, constraint satisfaction, and similarity scores, suggesting significant room for improvement. We hope that BikeBench, a first-of-its-kind benchmark, will help catalyze progress in generative AI for constrained multi-objective engineering design problems. We provide code, data, an interactive leaderboard, and other resources at https://github.com/Lyleregenwetter/BikeBench.


Human-in-the-Loop: Quantitative Evaluation of 3D Models Generation by Large Language Models

Sadik, Ahmed R., Bujny, Mariusz

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly capable of interpreting multimodal inputs to generate complex 3D shapes, yet robust methods to evaluate geometric and structural fidelity remain underdeveloped. This paper introduces a human-in-the-loop framework for the quantitative evaluation of LLM-generated 3D models, supporting applications such as democratization of CAD design, reverse engineering of legacy designs, and rapid prototyping. We propose a comprehensive suite of similarity and complexity metrics--including volumetric accuracy, surface alignment, dimensional fidelity, and topological intricacy--to benchmark generated models against ground-truth CAD references. Using an L-bracket component as a case study, we systematically compare LLM performance across four input modalities: 2D orthographic views, isometric sketches, geometric structure trees, and code-based correction prompts. Our findings demonstrate improved generation fidelity with increased semantic richness, with code-level prompts achieving perfect reconstruction across all metrics. A key contribution of this work is demonstrating that our proposed quantitative evaluation approach enables significantly faster convergence toward the ground truth, especially compared to traditional qualitative methods based solely on visual inspection and human intuition. This work not only advances the understanding of AI-assisted shape synthesis but also provides a scalable methodology to validate and refine generative models for diverse CAD applications.


RECALL-MM: A Multimodal Dataset of Consumer Product Recalls for Risk Analysis using Computational Methods and Large Language Models

Bolanos, Diana, Ataei, Mohammadmehdi, Grandi, Daniele, Goucher-Lambert, Kosa

arXiv.org Artificial Intelligence

Product recalls provide valuable insights into potential risks and hazards within the engineering design process, yet their full potential remains underutilized. In this study, we curate data from the United States Consumer Product Safety Commission (CPSC) recalls database to develop a multimodal dataset, RECALL-MM, that informs data-driven risk assessment using historical information, and augment it using generative methods. Patterns in the dataset highlight specific areas where improved safety measures could have significant impact. We extend our analysis by demonstrating interactive clustering maps that embed all recalls into a shared latent space based on recall descriptions and product names. Leveraging these data-driven tools, we explore three case studies to demonstrate the dataset's utility in identifying product risks and guiding safer design decisions. The first two case studies illustrate how designers can visualize patterns across recalled products and situate new product ideas within the broader recall landscape to proactively anticipate hazards. In the third case study, we extend our approach by employing a large language model (LLM) to predict potential hazards based solely on product images. This demonstrates the model's ability to leverage visual context to identify risk factors, revealing strong alignment with historical recall data across many hazard categories. However, the analysis also highlights areas where hazard prediction remains challenging, underscoring the importance of risk awareness throughout the design process. Collectively, this work aims to bridge the gap between historical recall data and future product safety, presenting a scalable, data-driven approach to safer engineering design.


Dynamic semantic networks for exploration of creative thinking

Georgiev, Danko D., Georgiev, Georgi V.

arXiv.org Artificial Intelligence

Human creativity originates from brain cortical networks that are specialized in idea generation, processing, and evaluation. The concurrent verbalization of our inner thoughts during the execution of a design task enables the use of dynamic semantic networks as a tool for investigating, evaluating, and monitoring creative thought. The primary advantage of using lexical databases such as WordNet for reproducible information-theoretic quantification of convergence or divergence of design ideas in creative problem solving is the simultaneous handling of both words and meanings, which enables interpretation of the constructed dynamic semantic networks in terms of underlying functionally active brain cortical regions involved in concept comprehension and production. In this study, the quantitative dynamics of semantic measures computed with a moving time window is investigated empirically in the DTRS10 dataset with design review conversations and detected divergent thinking is shown to predict success of design ideas. Thus, dynamic semantic networks present an opportunity for real-time computer-assisted detection of critical events during creative problem solving, with the goal of employing this knowledge to artificially augment human creativity.


Data Publishing in Mechanics and Dynamics: Challenges, Guidelines, and Examples from Engineering Design

Ebel, Henrik, van Delden, Jan, Lüddecke, Timo, Borse, Aditya, Gulakala, Rutwik, Stoffel, Marcus, Yadav, Manish, Stender, Merten, Schindler, Leon, de Payrebrune, Kristin Miriam, Raff, Maximilian, Remy, C. David, Röder, Benedict, Raj, Rohit, Rentschler, Tobias, Tismer, Alexander, Riedelbauch, Stefan, Eberhard, Peter

arXiv.org Artificial Intelligence

Data-based methods have gained increasing importance in engineering, especially but not only driven by successes with deep artificial neural networks. Success stories are prevalent, e.g., in areas such as data-driven modeling, control and automation, as well as surrogate modeling for accelerated simulation. Beyond engineering, generative and large-language models are increasingly helping with tasks that, previously, were solely associated with creative human processes. Thus, it seems timely to seek artificial-intelligence-support for engineering design tasks to automate, help with, or accelerate purpose-built designs of engineering systems, e.g., in mechanics and dynamics, where design so far requires a lot of specialized knowledge. However, research-wise, compared to established, predominantly first-principles-based methods, the datasets used for training, validation, and test become an almost inherent part of the overall methodology. Thus, data publishing becomes just as important in (data-driven) engineering science as appropriate descriptions of conventional methodology in publications in the past. This article analyzes the value and challenges of data publishing in mechanics and dynamics, in particular regarding engineering design tasks, showing that the latter raise also challenges and considerations not typical in fields where data-driven methods have been booming originally. Possible ways to deal with these challenges are discussed and a set of examples from across different design problems shows how data publishing can be put into practice. The analysis, discussions, and examples are based on the research experience made in a priority program of the German research foundation focusing on research on artificially intelligent design assistants in mechanics and dynamics.


Parametric-ControlNet: Multimodal Control in Foundation Models for Precise Engineering Design Synthesis

Zhou, Rui, Zhang, Yanxia, Yuan, Chenyang, Permenter, Frank, Arechiga, Nikos, Klenk, Matt, Ahmed, Faez

arXiv.org Artificial Intelligence

This paper introduces a generative model designed for multimodal control over text-to-image foundation generative AI models such as Stable Diffusion, specifically tailored for engineering design synthesis. Our model proposes parametric, image, and text control modalities to enhance design precision and diversity. Firstly, it handles both partial and complete parametric inputs using a diffusion model that acts as a design autocomplete co-pilot, coupled with a parametric encoder to process the information. Secondly, the model utilizes assembly graphs to systematically assemble input component images, which are then processed through a component encoder to capture essential visual data. Thirdly, textual descriptions are integrated via CLIP encoding, ensuring a comprehensive interpretation of design intent. These diverse inputs are synthesized through a multimodal fusion technique, creating a joint embedding that acts as the input to a module inspired by ControlNet. This integration allows the model to apply robust multimodal control to foundation models, facilitating the generation of complex and precise engineering designs. This approach broadens the capabilities of AI-driven design tools and demonstrates significant advancements in precise control based on diverse data modalities for enhanced design generation.


VehicleSDF: A 3D generative model for constrained engineering design via surrogate modeling

Morita, Hayata, Shintani, Kohei, Yuan, Chenyang, Permenter, Frank

arXiv.org Artificial Intelligence

A main challenge in mechanical design is to efficiently explore the design space while satisfying engineering constraints. This work explores the use of 3D generative models to explore the design space in the context of vehicle development, while estimating and enforcing engineering constraints. Specifically, we generate diverse 3D models of cars that meet a given set of geometric specifications, while also obtaining quick estimates of performance parameters such as aerodynamic drag. For this, we employ a data-driven approach (using the ShapeNet dataset) to train VehicleSDF, a DeepSDF based model that represents potential designs in a latent space witch can be decoded into a 3D model. We then train surrogate models to estimate engineering parameters from this latent space representation, enabling us to efficiently optimize latent vectors to match specifications. Our experiments show that we can generate diverse 3D models while matching the specified geometric parameters. Finally, we demonstrate that other performance parameters such as aerodynamic drag can be estimated in a differentiable pipeline.